# **REPORT ON OPENSTA**

By: Prateek Kumar 2021181 Btech - ECE

### 1. INTRODUCTION

OpenSTA is an open-source tool used for static timing analysis of digital designs. It analyzes the timing of the design by considering the delays of various components and interconnections. OpenSTA supports industry-standard formats like Liberty (.lib) for library characterization and Synopsys Design Constraint (SDC) for input constraints.

The main theme of this report is to analyze the delay calculation of OpenSTA and to find out methods to use a custom delay calculation model in OpenSTA instead of using their own delay calculation model.

NOTE: This report is based on STA version 2.4.0

# 2. INITIATING THE SOFTWARE

The invoke command for OpenSTA is in the 'app' folder inside OpenSTA. To initialize the software we must have the corresponding netlist using the .v file, the liberty file, and the sdc constraints file .The commands can be written directly into the OpenSTa software or can be given to the software using a tcl file.

# 3. MAIN FUNCTION

The main function for the software is located in the 'app' folder by the name of Main.cc. This function is run when the invoke command is first called. Let's analyze this function. The main function has various conditions:

- a. If Argc=2 and the second argument is "-help" the code goes to the function showUsage which takes in the first argument as the invoke command or Argv[0] and the second argument as init\_filename which is predefined.
- b. Else If Argc=2 and the second argument is "-version" the code prints the value of STA\_VERSION ie the current sta version being used (2.4.0).
- c. Else the function sets Argc=1 and passes 1, Argv, and tclAppInit as parameters to the Tcl Main

If no errors are found during tcl initialization then the function initStaApp runs

 This function calls another function initSta() which is located at OpenSTA/search/Sta.cc:223

- There it initializes a func initElapsedTime() which is stored in MachineLinux.cc and stores the current time in the variable &elapsed\_begin\_time\_
- Next the initSta() function initializes the TimingRole class located at OpenSTA/liberty/TimingRole.cc:53
- Various new objects are created while initializing Timingrole:

#### wire

\_,combinational\_,tristate\_enable\_,tristate\_disable\_,reg\_clk\_q\_,reg\_set\_clr\_,latch\_en\_q\_,latch\_d\_q\_,sdf\_iopath\_,setup\_,hold\_,recovery\_,removal\_,width\_,period\_,skew\_,nochange\_,output\_setup\_,output\_hold\_,gated\_clk\_hold\_,latch\_setup\_,latch\_hold\_,data\_check\_setup\_,data\_check\_hold\_,non\_seq\_setup\_,non\_seq\_hold\_

\*\*\*We have basically initiated the constraints which are there in sta.TimingRoles in OpenSTA define the roles or responsibilities of specific objects or elements in the design regarding timing analysis, and they play a crucial role in determining and enforcing timing constraints during the analysis process.

- Next, we initiate the PortDirection by coming back to the initsta() func in sta.cc Takes me to the address OpenSTA/network/PortDirection.cc:32
- Here new objects are initialized of type PortDirection like input\_,output\_,tristate\_,bidirect\_,internal\_,ground\_,power\_ and unknown\_ These are for checking the type of Port in the circuit.
- Now the next function is run in initsta(): initTmpStrings();
   Location: OpenSTA/util/StringUtil.cc:163
- Used to initialize 100 temporary strings of size 100 each. while initializing sta
- Next function in initsta(): initLiberty(). Location:OpenSTA/liberty/Liberty.cc:48
- Initialising liberty takes me the initiation of TimingArcSet which is located in OpenSTA/liberty/TimingArc.cc:511
- Initial configuration of timing arc is being set here
- Now the initSta initiates the delay constants located at OpenSTA/graph/DelayFloat.cc:33
- Initialises min and max delay constant values to 0
- Next function : registerDelayCalcs ()
- Location : OpenSTA/dcalc/DelayCalc.cc:35

# 4. PROCESSING A LIBERTY FILE INTO OPENSTA

The data from the liberty file is taken and stored in openSTA in its own network format. The steps of reading this network from a .lib file are:

- TCL MAIN initiated from the main.cc file. This initiates the TCL interpreter
- 2. The tcl interpreter takes the source tcl file as input from the sourceTclFile from StaMain.cc which takes the file from Main.cc
- 3. The tcl interpreter starts evaluating the file and uses the file StaAppTCL\_wrap.cxx which is a mapping between tcl functions and opensta functions.

- 4. In the wrapper file the \_wrap\_read\_liberty\_cmd is initiated
- 5. The read\_liberty wrapper takes us to the function read\_liberty\_cmd
- 6. Now we move to the read\_liberty function in sta.cc which takes me to the libertyreader.cc file
- 7. Here firstly a builder is initialised which then defines the visitor attributes are defined in Opensta. The visitors are the columns in the liberty file which is to be read by opensta.
- 8. The reader initialized in libertyreader then initiates a function readLibertyFile(const char \*filename, bool infer\_latches, Network \*network)



initially declared as a null pointer, and then their values are written into them using the

9.

parseLibertyFile(filename, this, report\_) function in the libertyparser.cc file this finally connects the opensta network to the liberty file. Each cell is defined by using the liberty files

```
139
           for (auto rf_index : RiseFall::rangeIndex()) {
   140 🗸
              have input threshold [rf index] = false;
   141
   142
              have_output_threshold_[rf_index] = false;
   143
              have slew lower threshold [rf index] = false;
   144
              have_slew_upper_threshold_[rf_index] = false;
   145
   146
   147
            //::LibertyParse debug = 1;
           parseLibertyFile(filename, this, report_);
   148
   149
            return library;
   150
10. 151
```

# 5. PROCESSING A VERILOG FILE INTO OPENSTA

- 1. TCL MAIN initiated from the main.cc file. This initiates the tcl interpreter
- 2. The tcl interpreter takes the source tcl file as input from the sourcetclfile from stamain.cc which takes the file from main.cc
- 3. The tcl interpreter starts evaluating the file and uses the file StaAppTCL\_wrap.cxx which is a mapping between tcl functions and opensta functions.
- 4. In the wrapper file the wrap read verilog cmd is initiated
- 5. Initially it checks if the file provided exists and no error is generated.
- 6. Finally the wrapper maps the verilog read function to read\_verilog\_cmd () function which is also in the same wrapper file. It is given the name of the verilog file as input
- 7. the read\_verilog\_cmd function provides an interface for reading a Verilog file and constructing the corresponding network in OpenSTA. It ensures that OpenSTA is properly initialized and performs the necessary setup before calling the appropriate function to read the Verilog file.
- If the network reader object is valid ie opensta has been properly initialised then the network is read onto the opensta app using the readVerilogFile(filename,network) function
- 9. If a netlist is already there then clear the previous data.
- 10. Now the function readverilogfile is initialised which takes me to the verilogreader.cc. New verilogreader is initialised which takes in the input as network.
- 11. This verilog reader sets up the network for the open sta
- 12. Next the read command is used to read the verilog file provided and read it to the network details in opensta

- 13. Here an init function is invoked which reads the file by taking in input as the filename.
- 14. The function then looks for the "Verilog" library in the network. If the library is not found, it creates a new library with the name "Verilog".init function sets up the necessary state and data structures before parsing and reading the Verilog file.

### 6. LINKING A VERILOG FILE TO THE LIBERTY NETWORK

- The command given in the tcl interpreter is analyzed by the inbuilt functions of the interpreter and then it comes to the StaAppTCL\_wrap.cxx. Here it is mapped to the function \_wrap\_link\_design\_cmd
- 2. This function initially checks for any errors and then initiates the link\_design\_cmd function. With the input as the name of the top\_cell which is in this case TOP
- 3. This function in turn calls the linkdesign function which is in the sta.cc file.
- 4. This function firstly clears all the previous data if present and then the linknetwork function is initialized in the concretenetwork.cc function. We move to the link\_verilog\_network function verilogreader.cc and then linknetwork function is initialized
- 5. Finds the module for the top cell in the library file provided. Once it is found then it finds the corresponding Verilog module. It then makes an instance of the top cell in the network. The code then iterates over itself to cover each port of the module and bind the ports of the module to the ports of the library.

# 7. REPORT GENERATION

The command for the generation of the report comes from the Search.tcl file and the cpp function that is called by it using the StaAppTCL\_wrap.cxx is report\_path\_end2

```
report_path_end2(PathEnd *end,

PathEnd *prev_end)

Factor of the second in the second
```

 Takes me to sta.cc which has the function reportPathEnd .this function takes me to reportpath.cc where it is defined • For a full report the reportpath.cc file calls the following functions

```
ReportPath::reportShort(const PathEndCheck *end)
425
426
427
        PathExpanded expanded(end->path(), this);
        reportShort(end, expanded);
428
429
430
      ₩id
431
      ReportPath::reportShort(const PathEndCheck *end,
432
433
            PathExpanded &expanded)
434
        reportStartpoint(end, expanded);
435
436
        reportEndpoint(end);
        reportGroup(end);
437
438
439
      void
440
441
      ReportPath::reportFull(const PathEndCheck *end)
442
      {
        PathExpanded expanded(end->path(), this);
443
        reportShort(end, expanded);
444
445
        reportSrcPathArrival(end, expanded);
446
        reportTgtClk(end);
        reportRequired(end, checkRoleString(end));
447
448
        reportSlack(end);
449
450
```

```
X
 prateek0328@DESKTOP-6DB\ ×
sta::ReportPath::reportFull (this=0x555555d60860, end=0x555556b99bc0)
    at /mnt/c/project/OpenSTA/search/ReportPath.cc:4444
444
          reportShort(end, expanded);
(qdb)
Startpoint: i8 (rising edge-triggered flip-flop clocked by clock)
Endpoint: i12 (rising edge-triggered flip-flop clocked by clock)
Path Group: clock
Path Type: min
445
          reportSrcPathArrival(end, expanded);
(dbp)
  Delay
           Time
                  Description
   0.00
           0.00
                  clock clock (rise edge)
           0.00
                  clock network delay (ideal)
   0.00
   0.00
           0.00 ^ i8/CK (DFF_X1)
           0.08 v i8/Q (DFF_X1)
   0.08
           0.09 ^ i11/ZN (NAND2_X1)
   0.01
   0.00
           0.09 ^ i12/D (DFF_X1)
           0.09
                  data arrival time
446
          reportTgtClk(end);
(gdb)
   0.00
           0.00
                  clock clock (rise edge)
   0.00
           0.00
                  clock network delay (ideal)
   0.00
           0.00
                  clock reconvergence pessimism
           0.00 ^ i12/CK (DFF_X1)
447
          reportRequired(end, checkRoleString(end));
(qdb)
   0.01
                  library hold time
           0.01
           0.01
                  data required time
448
          reportSlack(end);
(gdb)
           0.01
                  data required time
```

 The report full function is used to get the full report for a particular instruction given in the tcl file.

# 8. ARRIVAL INFORMATION FINDING PROCEDURE

#### **OVERVIEW OF THE PROCESS THAT TAKES PLACE:**

 Make graph: This step involves creating a graph data structure that represents the design under consideration. The graph likely represents the logical and physical elements of the design, such as cells, pins, nets, and their interconnections.

- 2. Propagate constants: This step involves propagating constant values throughout the graph. Constants are fixed values that do not change during circuit operation, and propagating them can help simplify subsequent analysis and optimizations.
- 3. Levelize: Levelization is a process in which the graph is partitioned into levels based on the logical and physical dependencies of its elements. This step organizes the graph in a way that allows for efficient analysis and optimization algorithms to be applied.
- 4. Delay calculation: This step involves calculating the delay of paths in the design. The delay represents the time it takes for a signal to propagate from a source to a destination through various components in the circuit. Delay calculation is crucial for determining the timing behavior of the design and identifying critical paths.
- 5. Update generated clocks: This step involves updating information related to generated clocks in the design. Generated clocks are derived clocks that are generated from primary clocks or other clock sources. Updating their information may involve adjusting their timing characteristics or accounting for clock gating and other clock-related optimizations.
- 6. Find arrivals: This step involves finding arrival times for paths in the design. Arrival times represent the times at which signals reach their destinations. By determining the arrival times, it becomes possible to analyze the timing behavior of the design and identify violations or meet specific timing requirements.

```
Overall flow:

make graph

propagate constants

levelize

delay calculation

update generated clocks

find arrivals
```

#### THE PROCEDURE:

A procedure is defined in search.tcl file named find\_timing\_paths\_cmd in line 107. This
procedure takes in the cmd and the name of the variable storing the command line
arguments. The procedure is used to find the timing paths and check for errors and call the
functions to compute the desired timing paths

- This tcl procedure handles the report command passed to it and tells the Opensta app which timing path to analyze. It then calls the appropriate function for calculating the information related to that timing path.
- This tcl command, in the end, calls the \_wrap\_find\_path\_ends function present in the sta\_tcl swig wrapper file. In turn, this function calls the find\_path\_ends function in the same file after selecting the appropriate inputs for the function.
- The find\_path\_ends function takes in a bunch of inputs

```
PathEndSeq |
5350
        find path ends(ExceptionFrom *from,
                 ExceptionThruSeq *thrus,
                 ExceptionTo *to,
                 bool unconstrained,
                 Corner *corner,
                 const MinMaxAll *delay min max,
                 int group_count,
                 int endpoint count,
                 bool unique pins,
                 float slack min,
                 float slack_max,
                 bool sort_by_slack,
                 PathGroupNameSet *groups,
                 bool setup,
                 bool hold,
                 bool recovery,
                 bool removal,
                 bool clk_gating_setup,
                 bool clk gating hold)
```

- This function first calls the cmdLinkedNetwork() function to check whether the network has been linked or not. If the Network is not linked then further execution is not necessary.
- If the network is linked and ready the function then calls the findPathEnds() function in the sta.cc file with all the necessary parameters.
- This function initially calls the searchPreamble() function. This function has 4 major functions:
  - findDelays(): Initially calls the delayCalcPreamble() function which calls the ensureClkNetwork() function. This function is ensuring the Network is already present and linked before finding the Delays.Going into the ensureClkNetwork() function:
    - Has two functions: ensureLevelized() and ensureClkNetwork()
      - ensureLevelized(): before the graph is levelized (graph partitioned into levels based on the logical and physical dependencies of its elements also identifies the critical path)we need to first make a graph, ensure that the graph is Sdc annotated and propagate the constants. ensureGraph() function ensures that the design graph is constructed and available for subsequent operations. The design graph represents the logical and physical connectivity of the design, including cells, pins, nets, and their relationships. If the

graph has not been constructed, this step ensures its creation. Then the ensureGraphSdcAnnotated() This function ensures that the design graph is annotated with information from the SDC (Synopsys Design Constraints) file. The SDC file contains timing and constraint information for the design, such as input delays, clock frequencies, and setup/hold constraints. Annotating the graph with this information is necessary for accurate delay calculation.

**Note**: All output pins are considered constrained because they may be downstream from a set\_min/max\_delay -from that does not have a set\_output\_delay.

Then the ensureConstantsPropogated() function is called and if the constants are not already propagated then this step ensures that constant propagation is performed on the design graph. Constant propagation analyzes the logic paths in the design to identify and propagate constant values (such as 0 or 1) through the logic gates. This information helps determine the values of signals and enables subsequent optimizations and accurate delay calculation.

ensureLevelized function ensures that the design graph is levelized, meaning that the graph's cells and pins are organized into levels or stages based on their logical dependencies. Levelization is essential for performing timing analysis, as it establishes the order in which the cells and pins are evaluated to calculate timing delays. The levelization process identifies the critical paths and helps determine the signal arrival times and required times at each endpoint. This levelization is done in

levelize.cc function which uses DFS to do this.

```
▶ prateek0328@DESKTOP-6DBV × ▶ prateek0328@DESKTOP-6DB\ ×
    at /mnt/c/project/OpenSTA/search/Levelize.hh:37
37
           bool levelized() { return levels_valid_; }
(gdb) bt
   sta::Levelize::levelized (
     this=0x5555557cb8e6 <std::_Vector_base<sta::Vector<sta::Vertex*>, std::a
llocator<sta::Vector<sta::Vertex*> >::_Vector_base()+28>)
    at /mnt/c/project/OpenSTA/search/Levelize.hh:37
#1 0x00005555557c962e in sta::BfsIterator::ensureSize (
    this=0x555555d5a100) at /mnt/c/project/OpenSTA/search/Bfs.cc:58
#2 0x00005555557c960a in sta::BfsIterator::init (this=0x555555d5a100)
#3 0x00005555557c957b in sta::BfsIterator::BfsIterator (
    this=0x555555d5a100, bfs_index=sta::BfsIndex::dcalc, level_min=0, level_max=2147483647, search_pred=0x555555d5a0c0, sta=0x555555d47260)
    at /mnt/c/project/OpenSTA/search/Bfs.cc:44
#4 0x00005555557cacbe in sta::BfsFwdIterator::BfsFwdIterator (
    this=0x555555d5a100, bfs_index=sta::BfsIndex::dcalc,
    at /mnt/c/project/OpenSTA/search/Bfs.cc:326
#5 0x000055555564c3ef in sta::GraphDelayCalc1::GraphDelayCalc1 (
    this=0x555555d59eb0, sta=0x555555d47260)
    at /mnt/c/project/OpenSTA/dcalc/GraphDelayCalc1.cc:222
#6 0x00005555558439d5 in sta::Sta::makeGraphDelayCalc (
    this=0x555555d47260) at /mnt/c/project/OpenSTA/search/Sta.cc:417
    0x0000555555843178 in sta::Sta::makeComponents (this=0x555555d47260) at /mnt/c/project/OpenSTA/search/Sta.cc:292
    0x0000555555592c7e in initStaApp (argc=@0x7ffffffffffffc: 2,
    argv=0x7ffffffffe238, interp=0x555555ce73b0)
at /mr/c/project/OpenSTA/app/Main.cc:175
#9 0x00005555555929fb in staTclAppInit (argc=2, argv=0x7fffffffe238, init_filename=0x55555594074 ".sta", interp=0x55555555ce73b0)
    at /mnt/c/project/OpenSTA/app/Main.cc:125
#10 0x00000555555592997 in tclAppInit (interp=0x555555ce73b0)
    at /mnt/c/project/OpenSTA/app/Main.cc:103
#11 0x00007ffff7f03006 in Tcl_MainEx ()
   from /lib/x86_64-linux-gnu/libtcl8.6.so
#12 0x0000555555592961 in main (argc=2, argv=0x7ffffffffe238)
   at /mnt/c/project/OpenSTA/app/Main.cc:94
(gdb)
```

levelized() is called earlier using this stack trace

- The ensureClkNetwork() function is called from clknetwork.cc
- findClkPins(): If the clock pins are not valid, this function is called to identify and validate the clock pins in the design. The findClkPins function is responsible for traversing the design hierarchy and identifying the primary and generated clock pins.

•

- Now we are done with delayCalcPreamble(). Next in findDelays() we have graph\_delay\_calc\_->findDelays(levelize\_->maxLevel()) By passing the maximum level of levelization as the input parameter, the findDelays function calculates the delays in the design up to that level. This ensures that the delay calculations are performed only on the relevant portions of the design, improving efficiency.
- Checks first that arc\_delay\_calculation is valid. Now if the delays are not seeded ie the initial value for the delays is not written yet then the

- Multidriver nets are found and the rootslews are seeded(GraphDelayCalc1::seedRootSlews() 551). Else:
- Firstly it ensures that the BFS queue is deep enough for the max logic level. Now if incremental mode is on then firstly seedInvalidDelay() function is called.:
  - This function iterates over all the vertices which are marked as invalid and seeds their slew time if they are root vertex or if they are not root vertex then enqueues non-root vertices with non-latch predecessors for further processing during delay calculation
- Now, a visitor object is initialized and visitparallel function is invoked to perform parallel visitation of vertices up to a certain level.
- Now timing check edge delay is done by iterating over all the elements in the invalid\_check\_edge container and using the function findCheckEdgeDelays(). Similarly latch edge delay is calculated for all the latch element connections by iterating over them and using the function findLatchEdgeDelays(). This ends the findDelays function
- updateGeneratedClks(): The basic function of the updateGeneratedClks()
  module in OpenSTA is to update the generated clocks in the design. Generated
  clocks are clocks that are derived from other clocks in the design hierarchy. This
  module ensures that the waveform of each generated clock is valid by generating
  it based on the waveform of its master clock.
- Sdc -> searchPreamble(): it has two functions ensureClkHpinDisables() and ensureClkGroupExclusions(). In conclusion, this module prepares clock configurations by disabling clock sink pins and applying exclusions to the clock group.
  - ensureClkHpinDisables(): This function enables all clock sink pins (hierarchical pins) to be disabled. Disabling the clock sink pins means that the sinks are not considered as sequential elements for clock path analysis. This step is necessary to accurately model the clock tree structure and avoid incorrect timing analysis results.
  - ensureClkGroupExclusions(): This function ensures that the necessary exclusions or exceptions are applied to clock groups. Clock groups are used to define relationships between related clocks in the design, such as synchronous or asynchronous clock domains. By specifying exclusions, certain paths between clocks within the same group can be ignored during timing analysis, improving efficiency and reducing false violations.
- search\_->deleteFilteredArrivals(): This function call deletes any previously filtered
  or pruned arrival times in the search\_ object. It is possible that during previous
  path searches or filtering operations, certain arrival times were excluded or
  marked as invalid. By deleting the filtered arrivals, the path search starts with a

clean slate, considering all valid arrival times for subsequent analysis.

From/thrus/to are used to make a filter exception. If the last search used a filter arrival/required times were only found for a subset of the paths. Delete the paths that have a filter exception state.

- Now we are done with setting up the environment for searching for the path. Now we begin actually searching the path using the findPathEnds() function in the search.cc file.
- We first get the findFilteredArrival() function in findPathEnds().:
  - This function initially deletes the results from the previous time the findPathEnds() was called. Uses the function deletePathGroups()
  - Next it checks all the from thrus and tos to be valid... no value should be empty or invalid.
  - Filtering or finding all arrivals: Depending on the presence of valid from and thrus objects, the function determines whether to perform filtered arrivals or find all arrivals. If either the from the object has associated pins or instances, or the thrus object is not null, the function proceeds with filtered arrivals.
    - findAllArrivals(): Takes in the boolean input thru\_latches which determines if thru\_latches are present or not
    - Firstly it initializes the visitor object which is used to track and update the arrival time of all the signals
    - Then we move into a loop to calculate the arrival time until arrival time at every latch stop changing
    - Inside the loop firstly we enqueue all the pending latches whose arrival time is yet to be calculated
    - Invokes the findArrivals1 function to calculate the arrival times of signals at the specified maximum level. This function performs the actual calculation of arrival times using the topological levelization of the design.
      - findArrivalsSeed(): performs initial setup and seeding of the arrival time calculation
        - genclks\_->ensureInsertionDelays(): Ensures that the insertion delays of generated clocks are properly accounted for.
        - Then it clears any information about arrival time or required arrival time that was present earlier.
        - Calls the seedArrivals(): handles the seeding of arrival times for different types of vertices based on their properties and connectivity within the design. It ensures that the necessary arrival time information is collected and processed during the arrival time calculation.
        - Next all the invalid arrivals are seeded.
      - visitParallel(): function in OpenSTA's BfsIterator class is responsible for performing a parallel visit of vertices in a

- breadth-first search (BFS) manner up to a specified level returns the number of arrivals to be counted
- After the delay is calculated
- Check if the recovery/removal checks are enabled in the SDC (Synopsys Design Constraints) and update the recovery and removal variables accordingly. If the checks are not enabled, both recovery and removal are set to false.
- checks if the gated clock checks are enabled in the SDC and update the clk\_gating\_setup and clk\_gating\_hold variables accordingly. If the checks are not enabled, both clk\_gating\_setup and clk\_gating\_hold are set to false.
- It calls the makePathGroups() function which returns a path group created according to the given constraints
- Then it calls the ensureDownstreamClockPins() function which uses backward BFS and marks all the pins in the level above to have downstream pins and the found downstream pins flag to be true if it founds a downstream pin.
- Next the makePathEnds function generates the path ends for different groups based on the provided constraints and options. It handles sorting, resizing, and the generation of the path ends for unconstrained paths, providing a sequence of path ends ready for further analysis and reporting. Finally these path ends are returned
- Next these path end values are sent to the tcl interpreter and there it calls and does the reporting part

# 9. DELAY CALCULATION IN OPENSTA

In OpenSTA, the calculation of delays involves several steps and functions working together. Here is an overview of how the delay calculation process works using the provided functions:

- 1. \*\*Path Generation\*\*: The process begins with the `findPathEnds` function. It takes various parameters such as `from`, `thrus`, `to`, and other constraints to generate the path ends using the `makePathGroups` and `makePathEnds` functions. These path ends represent the starting and ending points of the paths for which delays need to be calculated.
- 2. \*\*Arrival Time Calculation\*\*: The `findFilteredArrivals` function is called within `findPathEnds` to calculate the arrival times for the generated paths. This function filters the arrivals based on the given exceptions and determines the arrival times using the `findAllArrivals` and `findArrivals1` functions.
- 3. \*\*Required Time Calculation\*\*: Once the arrival times are determined, the required times need to be calculated. The `required\_iter\_->visitParallel` function is called within the `findArrivals1` function to calculate the required times for the paths up to the specified level. This process involves visiting the vertices in a breadth-first manner using multiple threads to efficiently traverse the design graph.

- 4. \*\*Delay Calculation\*\*: With both the arrival times and required times calculated, the delays can be determined. The delay calculation typically involves subtracting the arrival time from the required time to obtain the path delay. This calculation is performed within the `visit` function of the `VertexVisitor` class, which is passed as an argument to the `visitParallel` function.
- 5. \*\*Reporting and Analysis\*\*: Once the delays are calculated, OpenSTA provides various reporting and analysis functions to present the delay information. These functions may include generating delay reports, identifying critical paths, analyzing setup/hold violations, and reporting clock-to-clock maximum cycle warnings, as seen in the `sdc ->reportClkToClkMaxCycleWarnings` function call within `findPathEnds`.

Overall, the delay calculation process in OpenSTA involves generating path ends, calculating arrival times, calculating required times, performing delay calculations, and providing analysis and reporting functionalities. By utilizing these functions and the underlying design data, OpenSTA enables accurate and comprehensive delay analysis of digital designs.

**NOTE:** The delay calculation procedure happens inside the GraphDelayCalc.cc and GraphDelayCalc1.cc class. The function SearchPreamble calls the findDelays function in these classes and the delay calculation procedure is done here.

# 10. CUSTOM DELAY MODEL

I have found 2 ways through which we can implement our own delay calculation model inside OpenSTA.

#### 10.1 Inbuilt API

It is possible to use a custom delay calculation function in OpenSTA to calculate the delay. OpenSTA provides a flexible architecture allowing users to define their delay calculation algorithms by implementing a new class derived from the ArcDelayCalc class.

\*\*We get to know about this api from the StaApi.text file which is in the doc directory. Let's see how we can use a custom delay calculation function in OpenSTA:

- 1. Create a New Delay Calculator Class: Define a new class that is derived from the ArcDelayCalc class. This can be created by either modifying the current class or creating a new class by using this interface. This new class should implement the virtual functions defined in ArcDelayCalc to calculate the gate delay, driver slew, load delays, and load slews for a given timing arc. The specific logic for your custom delay calculation should be written in these functions.
- Register the New Delay Calculator: Register your custom delay calculator with OpenSTA using the registerDelayCalc function. This function allows OpenSTA to recognize your new delay calculator and make it available for use.

- 3. **Set the Custom Delay Calculator:** Use the setArcDelayCalc function in OpenSTA to set your custom delay calculator as the active delay calculator. This function ensures that OpenSTA uses your custom delay calculation logic during the timing analysis.
- 4. **Compile and Link:** After implementing the custom delay calculator, compile your code and link it with OpenSTA to include the new delay calculator in the STA executable.

The interface for the new model class is stored in the include\sta folder inside the ArcDelayCalc.h file

```
46
     class ArcDelayCalc : public StaState
     public:
       explicit ArcDelayCalc(StaState *sta);
       virtual ~ArcDelayCalc() {}
       virtual ArcDelayCalc *copy() = 0;
       // Find the parasitic for drvr pin that is acceptable to the delay
       // calculator by probing parasitics_.
       virtual Parasitic *findParasitic(const Pin *drvr_pin,
                const RiseFall *rf,
                const DcalcAnalysisPt *dcalc_ap) = 0;
       virtual ReducedParasiticType reducedParasiticType() const = 0;
       // This call primarily initializes the load delay/slew iterator.
       virtual void inputPortDelay(const Pin *port_pin,
                 float in slew,
                 const RiseFall *rf,
                 const Parasitic *parasitic,
                 const DcalcAnalysisPt *dcalc ap) = 0;
        // Find the delay and slew for arc driving drvr pin.
       virtual void gateDelay(const LibertyCell *drvr_cell,
```

```
// Find the delay and slew for arc driving drvr pin.
67
       virtual void gateDelay(const LibertyCell *drvr cell,
            const TimingArc *arc,
70
            const Slew &in slew,
            // Pass in load cap or drvr parasitic.
71
72
            float load cap,
            const Parasitic *drvr_parasitic,
73
74
            float related_out_cap,
75
            const Pvt *pvt,
            const DcalcAnalysisPt *dcalc_ap,
76
            // Return values.
            ArcDelay &gate delay,
78
79
            Slew &drvr slew) = 0;
       // Find the wire delay and load slew of a load pin.
80
81
       // Called after inputPortDelay or gateDelay.
       virtual void loadDelay(const Pin *load_pin,
82
            // Return values.
83
84
            ArcDelay &wire delay,
85
            Slew &load slew) = 0;
       virtual void setMultiDrvrSlewFactor(float factor) = 0;
       // Ceff for parasitics with pi models.
87
       virtual float ceff(const LibertyCell *drvr cell,
89
              const TimingArc *arc,
```

```
const Slew &in slew,
               float load cap,
               const Parasitic *drvr_parasitic,
               float related_out_cap,
               const Pvt *pvt,
               const DcalcAnalysisPt *dcalc_ap) = 0;
        // Find the delay for a timing check arc given the arc's
        // from/clock, to/data slews and related output pin parasitic.
        virtual void checkDelay(const LibertyCell *drvr_cell,
              const TimingArc *arc,
101
              const Slew &from_slew,
102
              const Slew &to_slew,
103
              float related out cap,
104
              const Pvt *pvt,
              const DcalcAnalysisPt *dcalc_ap,
105
106
              // Return values.
107
              ArcDelay &margin) = 0;
        // Report delay and slew calculation.
108
109
        virtual string reportGateDelay(const LibertyCell *drvr cell,
110
                                        const TimingArc *arc,
111
                                        const Slew &in_slew,
112
                                        // Pass in load cap or drvr parasitic.
                                        float load cap,
113
```

```
113
                                       float load cap,
                                       const Parasitic *drvr parasitic,
114
115
                                       float related out cap,
116
                                       const Pvt *pvt,
                                       const DcalcAnalysisPt *dcalc ap,
117
118
                                       int digits) = 0;
119
        // Report timing check delay calculation.
120
        virtual string reportCheckDelay(const LibertyCell *cell,
121
                                        const TimingArc *arc,
                                        const Slew &from slew,
122
                                        const char *from slew annotation,
123
124
                                        const Slew &to slew,
125
                                        float related out cap,
126
                                        const Pvt *pvt,
127
                                        const DcalcAnalysisPt *dcalc_ap,
128
                                        int digits) = 0;
129
        virtual void finishDrvrPin() = 0;
130
131
      protected:
        GateTimingModel *gateModel(const TimingArc *arc,
                 const DcalcAnalysisPt *dcalc ap) const;
133
        CheckTimingModel *checkModel(const TimingArc *arc,
134
135
                   const DcalcAnalysisPt *dcalc ap) const;
        TimingModel *model(const TimingArc *arc,
          TimingModel *model(const TimingArc *arc,
136
                   const DcalcAnalysisPt *dcalc_ap) const;
137
138
        };
139
        } // namespace
140
141
```

Register Dealy calculator function is in the dcalc directory in the Delaycalc.cc file

```
namespace sta {
28
29
    typedef Map<const char*, MakeArcDelayCalc, CharPtrLess> DelayCalcMap;
30
31
    static DelayCalcMap *delay_calcs = nullptr;
32
33
34
    registerDelayCalcs()
35
36
      registerDelayCalc("unit", makeUnitDelayCalc);
37
      registerDelayCalc("lumped cap", makeLumpedCapDelayCalc);
38
      registerDelayCalc("slew_degrade", makeSlewDegradeDelayCalc);
39
      registerDelayCalc("dmp ceff elmore", makeDmpCeffElmoreDelayCalc);
40
      registerDelayCalc("dmp_ceff_two_pole", makeDmpCeffTwoPoleDelayCalc);
41
      registerDelayCalc("arnoldi", makeArnoldiDelayCalc);
42
43
44
45
    registerDelayCalc(const char *name,
46
          MakeArcDelayCalc maker)
47
48
      if (delay calcs == nullptr)
49
        delay_calcs = new DelayCalcMap;
44
        void
        registerDelayCalc(const char *name,
45
                 MakeArcDelayCalc maker)
46
47
           if (delay_calcs == nullptr)
48
              delay_calcs = new DelayCalcMap;
49
           (*delay calcs)[name] = maker;
50
51
```

The setArcDelayCalc function is defined in sta.cc file in the search directory.

```
3325
       void
       Sta::setArcDelayCalc(const char *delay_calc_name)
3326
3327
3328
         delete arc delay calc ;
3329
         arc delay calc = makeDelayCalc(delay calc name, sta );
         // Update pointers to arc delay calc.
3330
3331
         updateComponentsState();
3332
         graph_delay_calc_->delaysInvalid();
         search ->arrivalsInvalid();
```

#### 10.2 FUNCTION IN PLACE OF DELAY CALCULATION FUNCTION

The delay calculation takes place in openSTA by using a delay calculation model. All the delay calculation models are derived from the main parent class ArcDelayCalc where the abstract class is defined to decide how each of the models will be designed. This is the hierarchy of the net delay models that are used in OpenSTA

```
34
     // Delay calculator class hierarchy.
35
     // ArcDelayCalc
     // UnitDelayCalc
        LumpedCapDelayCalc
37
         RCDelayCalc
           SlewDegradeDelayCalc
         DmpCeffDelayCalc
41
             DmpCeffElmoreDelayCalc
42
             DmpCeffTwoPoleDelayCalc
43
            ArnoldiDelayCalc
44
```

But we need to find the cell delay. For that, I found the TimingDelay model in OpenSTA. This model is defined in The file TableModel.cc in the liberty folder.

- This file has the definition for the GateTable Model. This model is derived from the ArcDelay class and hence it defines its virtual functions.
- The function which is used to calculate the gate delay is gateDelay() defined from line 97.

The hierarchy before the function call is shown below:

```
### protection forwar x | protection forwar
```

 Now inside this function the findValue function is used to pick up the delay from the liberty files using the slew and the load capacitance

```
GateTableModel::gateDelay(const LibertyCell *cell,
              const Pvt *pvt,
              float in_slew,
              float load_cap,
              float related out cap,
              bool pocv_enabled,
              // return values
              ArcDelay &gate_delay,
              Slew &drvr slew) const
        const LibertyLibrary *library = cell->libertyLibrary();
        float delay = findValue(library, cell, pvt, delay_model_, in_slew,
              load cap, related out cap);
110
        float sigma early = 0.0;
111
        float sigma late = 0.0;
112
        if (pocv enabled && delay sigma models [EarlyLate::earlyIndex()])
113
          sigma early = findValue(library, cell, pvt,
114
                delay_sigma_models_[EarlyLate::earlyIndex()],
                in_slew, load_cap, related_out_cap);
115
116
        if (pocv_enabled && delay_sigma_models_[EarlyLate::lateIndex()])
117
          sigma_late = findValue(library, cell, pvt,
               delay_sigma_models_[EarlyLate::lateIndex()],
118
```

```
113
          sigma_early = findValue(library, cell, pvt,
114
                delay sigma models [EarlyLate::earlyIndex()],
                in slew, load cap, related out cap);
        if (pocv enabled && delay sigma models [EarlyLate::lateIndex()])
116
          sigma late = findValue(library, cell, pvt,
               delay_sigma_models_[EarlyLate::lateIndex()],
119
               in_slew, load_cap, related_out_cap);
        gate delay = makeDelay(delay, sigma early, sigma late);
121
        float slew = findValue(library, cell, pvt, slew_model_, in_slew,
122
123
             load_cap, related_out_cap);
        if (pocv enabled && slew sigma models [EarlyLate::earlyIndex()])
          sigma_early = findValue(library, cell, pvt,
                slew_sigma_models_[EarlyLate::earlyIndex()],
                in slew, load cap, related out cap);
        if (pocv_enabled && slew_sigma_models_[EarlyLate::lateIndex()])
128
129
          sigma late = findValue(library, cell, pvt,
130
               slew_sigma_models_[EarlyLate::lateIndex()],
               in slew, load cap, related out cap);
        // Clip negative slews to zero.
        if (slew < 0.0)
          slew = 0.0;
        drvr slew = makeDelay(slew, sigma early, sigma late);
```

- We make changes at this location to set our own model. We can have a flag variable
  inside this class and use that flag variable to call the CustomDelayFunction when the
  flag is high. The custom model will take slew and loadCap as input and can show the
  output when that custom model is used. This flag variable will be called from the input tol
  file (sta.tcl in this case).
- Changes made in the code:
  - Firstly we must set the variable through the sta.tcl file

```
OpenSTA > \( \) \( \) sta.tcl \\

2 \quad \text{global use_custom_delay_model 1} \\
4 \quad \text{read_liberty example_typ.lib} \\
5 \quad \text{read_verilog TOP.v} \\
6 \quad \text{link_design TOP} \\
7 \quad \text{read_sdc q3.sdc} \\
9 \quad \text{report_checks -path_delay min_max} \\
10 \quad \text{#report_checks -through [get_ports IN1] -path_delay min_max} \\
11 \quad \text{#q9-> report_checks -to [get_ports OUT] -path_delay min_max > q10_reports_1.txt} \\
12 \quad \text{#q12-> report_checks -from i9 -to i12 -path_delay min_max > q12_reports_1.txt} \\
13 \\
14
```

Now all the changes made are in the TableModel.cc file:

Include Sta.hh and tcl.h

```
#include "Sta.hh"
#include "tcl.h"
#include "TableModel.hh"
##include "TableModel.hh"
```

Inside the gateDelay function get the value of the flag from the TCL interpreter and use it to decide which model to use.

```
GateTableModel::gateDelay(const LibertyCell *cell,
       const Pvt *pvt,
       float in_slew,
       float load_cap,
       float related_out_cap,
       bool pocv_enabled,
       ArcDelay &gate_delay,
       Slew &drvr_slew) const
 const LibertyLibrary *library = cell->libertyLibrary();
 float delay = findValue(library, cell, pvt, delay_model_, in_slew,
      load_cap, related_out_cap);
  const char* flagValue = Tcl_GetVar( Sta::sta()->tclInterp(), "use_custom_delay_model", TCL_GLOBAL_ONLY);
  int globalMyDelay= strcmp(flagValue, "1");
  if (!globalMyDelay){
  delay = getDelayFromMyModel(in_slew, load_cap);
  float sigma_early = 0.0;
```

 Create the function which will hold the custom delay function and return the delay by taking input as in slew and load cap.

```
float getDelayFromMyModel(float in_slew, float load_cap){

// USE THIS FUNCTION TO INPUT THE CUSTOM DELAY FUNCTION

return 1; //remove this after the function is completed

}

void

GateTableModel::gateDelay(const LibertyCell *cell,

const Pvt *pvt,

float in_slew,

float load_cap,
```

• We test the code first with the flag as zero

```
OpenSTA > \( \) sta.tcl

2    global use_custom_delay_model

3    set use_custom_delay_model 0 |

4    read_liberty example_typ.lib

5    read_verilog TOP.v

6    link_design TOP

7    read_sdc q3.sdc

8

9    report_checks -path_delay min_max

10    #report_checks -through [get_ports IN1] -path_delay min_max

11    #q9-> report_checks -to [get_ports OUT] -path_delay min_max > q10_reports_1.txt

12    #q12-> report_checks -from i9 -to i12 -path_delay min_max > q12_reports_1.txt
```

This is free software, and you are free to change and redistribute it

under certain conditions; type `show\_copying' for details.

This program comes with ABSOLUTELY NO WARRANTY; for details type `show\_warranty'.

Startpoint: i8 (rising edge-triggered flip-flop clocked by clock) Endpoint: i12 (rising edge-triggered flip-flop clocked by clock)

Path Group: clock Path Type: min

| Delay | Time   | Description                   |
|-------|--------|-------------------------------|
| 0.00  |        | clock clock (rise edge)       |
| 0.00  | 0.00   | clock network delay (ideal)   |
| 0.00  | 0.00 ^ | i8/CK (DFF_X1)                |
| 0.08  | 0.08 v | i8/Q (DFF_X1)                 |
| 0.01  | 0.09 ^ | i11/ZN (NAND2_X1)             |
| 0.00  | 0.09 ^ | i12/D (DFF_X1)                |
|       | 0.09   | data arrival time             |
|       |        |                               |
| 0.00  | 0.00   | clock clock (rise edge)       |
| 0.00  | 0.00   | clock network delay (ideal)   |
| 0.00  | 0.00   | clock reconvergence pessimism |
|       | 0.00 ^ | i12/CK (DFF_X1)               |
| 0.01  | 0.01   | library hold time             |
|       | 0.01   | data required time            |
|       |        |                               |
|       | 0.01   | data required time            |
|       | -0.09  |                               |
|       |        |                               |
|       | 0.09   | slack (MET)                   |

Startpoint: IN1 (input port clocked by clock)

Endpoint: i8 (rising edge-triggered flip-flop clocked by clock)

Path Group: clock Path Type: max

| Delay | Time    | Description                   |
|-------|---------|-------------------------------|
| 0.00  | 0.00    | clock clock (rise edge)       |
| 0.00  | 0.00    | clock network delay (ideal)   |
| 1.50  | 1.50 v  | input external delay          |
| 0.00  | 1.50 v  | IN1 (in)                      |
| 0.00  | 1.50 v  | i8/D (DFF_X1)                 |
|       | 1.50    | data arrival time             |
|       |         |                               |
| 10.00 | 10.00   | clock clock (rise edge)       |
| 0.00  | 10.00   | clock network delay (ideal)   |
| 0.00  | 10.00   | clock reconvergence pessimism |
|       | 10.00 ^ | i8/CK (DFF_X1)                |
| -0.07 | 9.93    | library setup time            |
|       | 9.93    | data required time            |
|       | 9.93    | data required time            |
|       | -1.50   | •                             |
|       |         |                               |
|       | 8.43    | slack (MET)                   |

• Next when the flag is high

```
OpenSTA > \begin{align*} \stack \text{ \text
```

Startpoint: i8 (rising edge-triggered flip-flop clocked by clock) Endpoint: i12 (rising edge-triggered flip-flop clocked by clock)

Path Group: clock Path Type: min

| Delay    | Time        | Description                    |
|----------|-------------|--------------------------------|
| 0.00     | 0.00        | clock clock (rise edge)        |
| 0.00     | 0.00        | clock network delay (ideal)    |
| 0.00     | 0.00 ^      | i8/CK (DFF_X1)                 |
| 10000000 | 000.00 1000 | 0000000.00 v i8/Q (DFF_X1)     |
| 10000000 | 900.00 2000 | 0000000.00 ^ i11/ZN (NAND2_X1) |
| 0.00     | 2000000000  | 0.00 ^ i12/D (DFF_X1)          |
|          | 2000000000  | 0.00 data arrival time         |
|          |             |                                |
| 0.00     | 0.00        | clock clock (rise edge)        |
| 0.00     | 0.00        | clock network delay (ideal)    |
| 0.00     | 0.00        | clock reconvergence pessimism  |
|          | 0.00 ^      | i12/CK (DFF_X1)                |
| 0.01     | 0.01        | library hold time              |
|          | 0.01        | data required time             |
|          | 0.01        | data required time             |
|          |             | 00.00 data arrival time        |
|          | 2000000000  | <br>0.00 slack (MET)           |

Startpoint: i9 (rising edge-triggered flip-flop clocked by clock) Endpoint: i12 (rising edge-triggered flip-flop clocked by clock)

Path Group: clock Path Type: max

| Delay<br> | Time        | Description                    |
|-----------|-------------|--------------------------------|
| 0.00      | 0.00        | clock clock (rise edge)        |
| 0.00      | 0.00        | clock network delay (ideal)    |
| 0.00      | 0.00 ^      | i9/CK (DFF_X1)                 |
| 10000000  | 000.00 1000 | 0000000.00 ^ i9/Q (DFF_X1)     |
| 10000000  | 000.00 2000 | 0000000.00 v i10/ZN (INV_X4)   |
| 10000000  | 000.00 3000 | 0000000.00 ^ i11/ZN (NAND2_X1) |
| 0.00      | 3000000000  | 0.00 ^ i12/D (DFF_X1)          |
|           | 3000000000  | 0.00 data arrival time         |
|           |             |                                |
| 10.00     | 10.00       | clock clock (rise edge)        |
| 0.00      | 10.00       | clock network delay (ideal)    |
| 0.00      | 10.00       | clock reconvergence pessimism  |
|           | 10.00 ^     | i12/CK (DFF_X1)                |
| -0.03     | 9.97        | library setup time             |
|           | 9.97        | data required time             |
|           | 9 97        | data required time             |
|           |             | 00.00 data arrival time        |
|           |             |                                |
|           | _300000000  | 00 00 slack (VTOLATED)         |